20 research outputs found

    Gradient methods for problems with inexact model of the objective

    Get PDF
    We consider optimization methods for convex minimization problems under inexact information on the objective function. We introduce an inexact model of the objective, which includes as particular cases the inexact oracle [19] and the relative smoothness condition [43]. We analyze a gradient method that uses this inexact model and obtain convergence rates for convex and strongly convex problems. To show potential applications of our general framework we consider three particular problems. The first is clustering by the electoral model introduced in [49]. The second is approximating the optimal transport distance, for which we propose a Proximal Sinkhorn algorithm. The third is approximating the optimal transport barycenter, for which we propose a Proximal Iterative Bregman Projections algorithm. We also illustrate the practical performance of our algorithms by numerical experiments.
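
    The Proximal Sinkhorn algorithm mentioned above builds on the classical Sinkhorn iterations for entropy-regularized optimal transport. As background only, here is a minimal NumPy sketch of those classical iterations; the paper's proximal outer loop is not reproduced, and the function name and parameters are illustrative assumptions.

        import numpy as np

        def sinkhorn(a, b, C, eps=0.1, n_iter=500):
            """Approximate OT cost between histograms a, b with cost matrix C."""
            K = np.exp(-C / eps)                  # Gibbs kernel from the cost matrix
            u = np.ones_like(a)
            for _ in range(n_iter):
                v = b / (K.T @ u)                 # scale columns to match marginal b
                u = a / (K @ v)                   # scale rows to match marginal a
            P = u[:, None] * K * v[None, :]       # resulting transport plan
            return np.sum(P * C)                  # entropic approximation of the OT cost

        # toy usage
        a = np.array([0.5, 0.5]); b = np.array([0.3, 0.7])
        C = np.array([[0.0, 1.0], [1.0, 0.0]])
        print(sinkhorn(a, b, C))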

    Truncated Inference for Latent Variable Optimization Problems: Application to Robust Estimation and Learning

    Full text link
    Optimization problems with an auxiliary latent variable structure in addition to the main model parameters occur frequently in computer vision and machine learning. The additional latent variables make the underlying optimization task expensive, either in terms of memory (by maintaining the latent variables) or in terms of runtime (by repeated exact inference of latent variables). We aim to remove the need to maintain the latent variables and propose two formally justified methods that dynamically adapt the required accuracy of latent variable inference. These methods have applications in large-scale robust estimation and in learning energy-based models from labeled data.
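
    To illustrate the general idea of truncated latent variable inference (not the paper's two specific methods), the following sketch solves a lifted robust least-squares problem in which the latent confidence weights are refined by only a few gradient steps per outer iteration instead of being inferred exactly or stored; the lifted robust kernel, step sizes, and function name are assumptions made for this example.

        import numpy as np

        def robust_fit(A, y, outer=200, inner=3, tau=1.0):
            """Minimize over x and w in [0,1]^m:  sum_i w_i*r_i(x)**2 + tau**2*(w_i-1)**2."""
            x = np.zeros(A.shape[1])
            lr_x = 1.0 / np.linalg.norm(A, 2) ** 2     # safe step for the weighted quadratic in x
            lr_w = 1.0 / (4.0 * tau ** 2)              # half the exactly minimizing step in w
            for _ in range(outer):
                r = A @ x - y
                w = np.ones(len(y))                    # latent weights recomputed, not maintained
                for _ in range(inner):                 # truncated inference: only a few steps
                    grad_w = r ** 2 + 2.0 * tau ** 2 * (w - 1.0)
                    w = np.clip(w - lr_w * grad_w, 0.0, 1.0)
                x -= lr_x * (A.T @ (w * r))            # parameter update with the inexact weights
            return x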

    Advances in low-memory subgradient optimization

    Get PDF
    One of the main goals in the development of non-smooth optimization is to cope with high-dimensional problems by decomposition, duality, or Lagrangian relaxation, which greatly reduce the number of variables at the cost of worsening the differentiability of the objective or constraints. The small or medium dimensionality of the resulting non-smooth problems allows the use of bundle-type algorithms to achieve higher rates of convergence and higher accuracy, which of course comes at the cost of additional memory requirements, typically of the order of n^2, where n is the number of variables of the non-smooth problem. However, with the rapid development of ever more sophisticated models in industry, economics, finance, etc., such memory requirements are becoming too hard to satisfy. This raised interest in subgradient-based low-memory algorithms, and later developments in this area significantly improved over the early variants while still preserving O(n) memory requirements. To review these developments, this chapter is devoted to black-box subgradient algorithms with minimal requirements for the storage of the auxiliary results needed to execute them. To provide historical perspective, the survey starts with the original result of N.Z. Shor, which opened this field with an application to the classical transportation problem. The theoretical complexity bounds for smooth and non-smooth convex and quasi-convex optimization problems are then briefly exposed to introduce the relevant fundamentals of non-smooth optimization. Special attention in this section is given to the adaptive step-size policy, which aims to attain the lowest complexity bounds. Unfortunately, the non-differentiability of the objective function in convex optimization essentially worsens the theoretical lower bounds on the rate of convergence of subgradient optimization compared to the smooth case, but there are modern techniques that allow solving non-smooth convex optimization problems faster than these lower complexity bounds dictate. In this work particular attention is given to the Nesterov smoothing technique, the Nesterov universal approach, and the Legendre (saddle-point) representation approach. The new results on universal Mirror Prox algorithms represent the original part of the survey. To demonstrate the application of non-smooth convex optimization algorithms to the solution of huge-scale extremal problems, we consider convex optimization problems with non-smooth functional constraints and propose two adaptive Mirror Descent methods. The first method is of primal-dual variety and is proved to be optimal in terms of lower oracle bounds for the class of Lipschitz-continuous convex objectives and constraints. The advantages of applying this method to the sparse Truss Topology Design problem are discussed in some detail. The second method can be applied to convex and quasi-convex optimization problems and is optimal in the sense of complexity bounds. The concluding part of the survey contains important references characterizing recent developments in non-smooth convex optimization.
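
    For constrained non-smooth problems of the kind described above, a well-known building block is the switching-subgradient scheme: take an objective step when the constraint is nearly satisfied and a constraint step otherwise. The minimal Euclidean sketch below illustrates only this basic scheme; the chapter's adaptive Mirror Descent methods generalize it with non-Euclidean prox functions, and the function name and parameters here are illustrative assumptions.

        import numpy as np

        def switching_subgradient(f, df, g, dg, x0, eps=1e-2, n_iter=5000):
            """Approximately solve min f(x) s.t. g(x) <= 0 for Lipschitz non-smooth f, g."""
            x = x0.copy()
            productive = []                                # iterates where the constraint nearly holds
            for _ in range(n_iter):
                if g(x) <= eps:                            # "productive" step: decrease the objective
                    d = df(x)
                    productive.append(x.copy())
                else:                                      # "non-productive" step: restore feasibility
                    d = dg(x)
                x = x - (eps / (np.dot(d, d) + 1e-12)) * d # adaptive step eps / ||d||^2
            return np.mean(productive, axis=0) if productive else x

        # toy usage: minimize |x1| + |x2| subject to x1 + x2 >= 1
        f  = lambda x: np.abs(x).sum()
        df = lambda x: np.sign(x)
        g  = lambda x: 1.0 - x[0] - x[1]
        dg = lambda x: np.array([-1.0, -1.0])
        print(switching_subgradient(f, df, g, dg, np.zeros(2)))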

    SILICON-GERMANIUM DEVICE NANOSTRUCTURES FOR OPTOELECTRONIC APPLICATIONS

    Get PDF
    The influence of technological parameters (substrate temperature, number of Ge layers, ion treatment) on the optical properties of Si/Ge nanostructures with Ge quantum dots has been studied. Raman scattering lines related to the Si-Si, Ge-Ge and Si-Ge vibration modes have been detected in the Raman spectra of the Si/Ge nanostructures. After hydrogen-plasma ion treatment of the Si/Ge nanostructures, a change in the spectral shape and a significant enhancement of the intensity of the luminescence band at 0.8 eV, related to radiative recombination of nonequilibrium charge carriers (electrons and holes) on the Ge quantum dots, is observed. This is important for increasing the luminescence quantum efficiency of devices based on Si nanolayers with Ge quantum dots.

    First-Order Methods for Convex Optimization

    No full text
    First-order methods for solving convex optimization problems have been at the forefront of mathematical optimization in the last 20 years. The rapid development of this important class of algorithms is motivated by the success stories reported in various applications, most importantly machine learning, signal processing, imaging, and control theory. First-order methods have the potential to provide low-accuracy solutions at low computational complexity, which makes them an attractive set of tools in large-scale optimization problems. In this survey, we cover a number of key developments in gradient-based optimization methods. This includes non-Euclidean extensions of the classical proximal gradient method and its accelerated versions. Additionally, we survey recent developments within the class of projection-free methods and proximal versions of primal-dual schemes. We give complete proofs for various key results and highlight the unifying aspects of several optimization algorithms.
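
    The classical proximal gradient method that this survey starts from alternates a gradient step on the smooth part with a proximal step on the non-smooth part. A minimal sketch for the Lasso is given below; accelerated and non-Euclidean variants discussed in the survey are omitted, and the function name and parameters are illustrative assumptions.

        import numpy as np

        def ista(A, y, lam, n_iter=500):
            """Proximal gradient (ISTA) for min 0.5*||A x - y||^2 + lam*||x||_1."""
            L = np.linalg.norm(A, 2) ** 2                  # Lipschitz constant of the gradient
            x = np.zeros(A.shape[1])
            for _ in range(n_iter):
                grad = A.T @ (A @ x - y)                   # gradient of the smooth part
                z = x - grad / L                           # forward (gradient) step
                x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # prox of lam*||.||_1
            return x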

    Self-Concordant Analysis of Frank-Wolfe Algorithms

    No full text
    Projection-free optimization via variants of the Frank-Wolfe (FW) method, a.k.a. the Conditional Gradient method, has become one of the cornerstones of optimization for machine learning, since in many cases the linear minimization oracle is much cheaper to implement than projections and some sparsity needs to be preserved. In a number of applications, e.g. Poisson inverse problems or quantum state tomography, the loss is given by a self-concordant (SC) function with unbounded curvature, implying the absence of theoretical guarantees for existing FW methods. We use the theory of SC functions to provide a new adaptive step size for FW methods and prove a global convergence rate of O(1/k) after k iterations. If the problem admits a stronger local linear minimization oracle, we construct a novel FW method with a linear convergence rate for SC functions.
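
    For reference, the basic FW template over the probability simplex with the classical 2/(k+2) step size is sketched below; the paper's adaptive step size for self-concordant losses is not reproduced, and the function name and parameters are illustrative assumptions.

        import numpy as np

        def frank_wolfe(grad, x0, n_iter=200):
            """Frank-Wolfe over the probability simplex with the classical step size."""
            x = x0.copy()
            for k in range(n_iter):
                g = grad(x)
                s = np.zeros_like(x)
                s[np.argmin(g)] = 1.0              # linear minimization oracle on the simplex
                gamma = 2.0 / (k + 2.0)            # classical step size; the paper adapts this
                x = (1 - gamma) * x + gamma * s    # convex combination keeps x feasible
            return x

        # toy usage: minimize ||x - c||^2 over the simplex
        c = np.array([0.2, 0.7, 0.1])
        grad = lambda x: 2.0 * (x - c)
        print(frank_wolfe(grad, np.ones(3) / 3))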

    SILICON-GERMANIUM NANOSTRUCTURES WITH GERMANIUM QUANTUM DOTS FOR OPTOELECTRONIC APPLICATIONS

    No full text
    The influence of technological parameters (substrate temperature, number of Ge layers, ion treatment) on the optical properties of Si/Ge nanostructures with Ge quantum dots has been studied. Raman scattering lines related to the Si-Si, Ge-Ge and Si-Ge vibration modes have been detected in the Raman spectra of the Si/Ge nanostructures. A significant enhancement of the intensity of the luminescence band at 0.8 eV, related to radiative recombination on the Ge quantum dots, is observed after hydrogen-plasma ion treatment of the Si/Ge nanostructures. This is important for increasing the luminescence quantum efficiency of devices based on Si nanolayers with Ge quantum dots.